Abstract

Evidence is presented for the associated production of a single top quark and a W boson
in pp collisions at $\sqrt{s} = 7$ TeV with the CMS experiment at the LHC. The analyzed
data correspond to an integrated luminosity of 4.9 fb$^{-1}$. The measurement is performed
using events with two leptons and a jet originating from a b quark. A multivariate analysis
based on kinematic properties is used to separate the $t\bar{t}$ background from the signal.
The observed signal has a significance of 4.0$\sigma$ and corresponds to a cross section of
$16^{+5}_{-4}$ pb, in agreement with the standard model expectation of
$15.6 \pm 0.4^{+1.0}_{-1.2}$ pb.

The electroweak production of single top quarks, first reported by the D0 [1] and CDF [2]
experiments at the Tevatron, has also been observed at the Large Hadron Collider (LHC).
Single-top-quark production proceeds via three processes: the t-channel exchange of a virtual
W boson, the associated production of a top quark and a W boson (tW), and the s-channel
production and decay of a virtual W boson. The ATLAS and Compact Muon Solenoid (CMS)
experiments have measured the cross section for t-channel production [3, 4], while evidence
for tW associated production has been presented by the ATLAS experiment [5]. This Letter
presents the evidence from the CMS experiment for tW production in pp collisions at
$\sqrt{s} = 7$ TeV.




Figure 1: Leading order Feynman diagrams for single-top-quark production in the tW mode. 

The production cross section for tW is too small to be measured at the Tevatron, but it is
significant at the LHC, where it exceeds the s-channel production rate. The approximate
next-to-next-to-leading-order (NNLO) theoretical prediction of the cross section for tW in
pp collisions at $\sqrt{s} = 7$ TeV is $15.6 \pm 0.4^{+1.0}_{-1.2}$ pb [6], assuming a
top-quark mass ($m_t$) of 172.5 GeV.

The leading-order Feynman diagrams for tW production are shown in Fig. 1. The definition
of tW production in perturbative QCD poses conceptual problems because it mixes with
top-quark pair production ($t\bar{t}$) at next-to-leading order (NLO) [7, 8]. Two schemes
have been proposed to describe the tW signal: "diagram removal" (DR) [9], where all NLO
diagrams that are doubly resonant, such as those in Fig. 2, are excluded from the signal
definition; and "diagram subtraction" (DS) [9, 10], in which the differential cross section
is modified with a gauge-invariant subtraction term that locally cancels the contribution of
$t\bar{t}$ diagrams. The DR scheme is used in this Letter, but it has been verified that the
number of predicted events after the full selection is consistent between the two approaches
within the statistical uncertainties of the simulated samples. The differences are accounted
for in the systematic uncertainties.

Figure 2: Feynman diagrams for tW single-top-quark production at next-to-leading order that 
are removed from the signal definition in the DR simulation scheme. 

In the standard model, top quarks decay almost exclusively to a W boson and a b quark. The
study presented here has been performed in the channels in which both W bosons decay
leptonically into a muon or an electron and a neutrino, with a branching fraction
$\mathcal{B}(W \to \ell\nu) = (10.80 \pm 0.09)\%$, where $\ell$ = e or $\mu$ [11]. The
dilepton final states of the tW process are characterized by the presence of two isolated
leptons with opposite charge, a jet from the fragmentation of a b quark, and a substantial
amount of missing transverse energy ($E_T^{\text{miss}}$) due to the presence of the
neutrinos. The primary source of background events is $t\bar{t}$ production, followed by
Z/$\gamma^*$+jets processes.
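
As a rough illustration of how small the dilepton channel is, the quoted branching fraction can be combined for the two W bosons in a tW event. This is a minimal sketch under a stated assumption that is not part of the analysis description: leptonic tau decays, which also feed the e and $\mu$ final states, are ignored.

```python
# Branching fraction quoted in the text: B(W -> l nu) = (10.80 +/- 0.09)% per lepton flavour.
B_W_LNU = 0.1080
B_W_LNU_ERR = 0.0009


def dilepton_fraction(b=B_W_LNU, db=B_W_LNU_ERR):
    """Fraction of tW events in which both W bosons decay directly to e or mu.

    Assumption (illustrative only): contributions from W -> tau nu followed by
    a leptonic tau decay are neglected.
    """
    per_w = 2.0 * b          # each W can decay to e nu or to mu nu
    frac = per_w ** 2        # two independent W decays
    err = 8.0 * b * db       # linear propagation: d(4 b^2)/db = 8 b
    return frac, err


frac, err = dilepton_fraction()

# Split by final state: ee and mumu each get b^2, emu gets 2 b^2.
channels = {"ee": B_W_LNU ** 2, "emu": 2.0 * B_W_LNU ** 2, "mumu": B_W_LNU ** 2}

print(f"dilepton fraction = {100 * frac:.2f}% +/- {100 * err:.2f}%")
for name, value in channels.items():
    print(f"  {name}: {100 * value:.2f}%")
```

Under this assumption only a few percent of tW events end up in the analyzed final states, with the e$\mu$ channel twice as populated as ee or $\mu\mu$.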

The analysis uses fits to a discriminant variable built from kinematic quantities combined
with a multivariate technique. A second analysis, intended as a cross-check of the robustness
of the selection, is performed using event counting. In both cases, a sample collected at
$\sqrt{s} = 7$ TeV by CMS, corresponding to an integrated luminosity of 4.9 fb$^{-1}$, is
used.

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal
diameter, providing a magnetic field of 3.8 T. Within the field volume are a silicon pixel
and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a
brass/scintillator hadron calorimeter. Muons are measured in gas-ionization detectors
embedded in the steel return yoke. Extensive forward calorimetry complements the coverage
provided by the barrel and endcap detectors. A more detailed description can be found in
Ref. [12].

Single-top-quark events in all channels have been simulated with the POWHEG event generator
version 301 [13], designed to describe the full NLO properties of these processes, while
MADGRAPH 5.1.1 [14] is used for $t\bar{t}$ and for the inclusive single-boson production
(V+X), where V = W, Z and X can indicate light or heavy partons. The remaining background
samples are simulated using PYTHIA version 6.4.24 [15], including diboson production and QCD
multijet production enriched in events with electrons or muons produced in the decay of b and
c quarks, and muons from the decay of long-lived hadrons. The CTEQ 6.6M parton distribution
function sets [16] are used for all simulated samples. All generated events undergo a full
simulation of the detector response according to the CMS implementation of GEANT4 [17, 18].
The value used for the top-quark mass is $m_t = 172.5$ GeV.

Approximate NNLO theoretical predictions are used to normalize $t\bar{t}$ production
(central value $\sigma_{t\bar{t}} = 163$ pb) [19]; W+jets and Z/$\gamma^*$+jets processes
are normalized to complete NNLO calculations for the inclusive cross sections, and NLO cross
sections are used for diboson processes [20]. Unless otherwise stated, the theoretical values
of the cross sections have been used in this Letter to normalize the simulation in figures
and tables.

Leptons, jets, and $E_T^{\text{miss}}$ are reconstructed by the CMS particle-flow (PF)
algorithm [21], which performs a global event reconstruction and provides the full list of
particles identified as electrons, muons, photons, and charged and neutral hadrons.

Events are collected using dilepton triggers with electrons or muons; the highest lepton
transverse energy threshold used in these triggers is 17 GeV. The two selected leptons must
originate from the same primary vertex. Muon (electron) candidates are required to have a
transverse momentum $p_T > 20$ GeV and pseudorapidity $|\eta| < 2.4$ (2.5); events with
additional leptons are vetoed.

To remove low-invariant-mass Drell-Yan (Z/$\gamma^*$) events, the invariant mass of the
lepton pair ($m_{\ell\ell}$) is required to be greater than 20 GeV. In the ee and $\mu\mu$
final states, events are also rejected if $m_{\ell\ell}$ is between 81 and 101 GeV,
compatible with the Z boson mass; this veto removes background from Z/$\gamma^*$+jets, as
well as from ZZ and WZ processes. In the ee and $\mu\mu$ decay channels, a requirement is
also applied on the $E_T^{\text{miss}}$ to further reduce the contribution from events
without genuine $E_T^{\text{miss}}$ (mostly Z/$\gamma^*$+jets and QCD multijet production).
Since the $E_T^{\text{miss}}$ resolution is degraded in events with a large number of
multiple interactions (high-pileup scenarios), an additional quantity is used
(tracker-$E_T^{\text{miss}}$), calculated using only the charged particles associated with
the primary vertex. Events are selected if both $E_T^{\text{miss}}$ and
tracker-$E_T^{\text{miss}}$ are larger than 30 GeV.
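
The dilepton requirements described above can be summarized as a short selection function. This is only a sketch: the event layout (a dictionary with the keys `leptons`, `mll`, `met`, and `tracker_met`) is hypothetical, the opposite-charge requirement is taken from the description of the signal topology rather than from an explicit cut in the text, and trigger, vertex, and isolation requirements are not modeled.

```python
def passes_dilepton_selection(ev):
    """Apply the dilepton cuts from the text to a hypothetical event dictionary.

    ev["leptons"]: list of dicts with 'flavour' ('e' or 'mu'), 'pt' [GeV], 'eta', 'charge'
    ev["mll"], ev["met"], ev["tracker_met"]: values in GeV
    """
    leptons = ev["leptons"]
    if len(leptons) != 2:                       # events with additional leptons are vetoed
        return False
    for lep in leptons:
        max_eta = 2.4 if lep["flavour"] == "mu" else 2.5
        if lep["pt"] <= 20.0 or abs(lep["eta"]) >= max_eta:
            return False
    # Assumption: opposite-charge pair, as in the tW dilepton topology.
    if leptons[0]["charge"] * leptons[1]["charge"] >= 0:
        return False
    if ev["mll"] <= 20.0:                       # low-mass Drell-Yan veto
        return False
    same_flavour = leptons[0]["flavour"] == leptons[1]["flavour"]
    if same_flavour:
        if 81.0 < ev["mll"] < 101.0:            # Z-boson mass window veto (ee, mumu only)
            return False
        if ev["met"] <= 30.0 or ev["tracker_met"] <= 30.0:
            return False
    return True


def make_event(f1, f2, mll, met, tracker_met):
    """Build a toy event with two central, opposite-charge leptons."""
    return {
        "leptons": [
            {"flavour": f1, "pt": 35.0, "eta": 1.0, "charge": -1},
            {"flavour": f2, "pt": 25.0, "eta": -0.5, "charge": +1},
        ],
        "mll": mll, "met": met, "tracker_met": tracker_met,
    }


emu_event = make_event("e", "mu", 90.0, 10.0, 5.0)      # no Z veto or MET cut in e-mu
ee_on_z = make_event("e", "e", 91.0, 50.0, 45.0)        # inside the Z window
mumu_ok = make_event("mu", "mu", 60.0, 45.0, 40.0)      # passes every requirement
```

In this sketch the e$\mu$ event passes even with low missing transverse energy, whereas same-flavour events must clear both the Z-window veto and the two $E_T^{\text{miss}}$ thresholds.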
Jets are defined according to the anti-$k_T$ algorithm [22] with a distance parameter of 0.5.
Jets within $|\eta| < 2.4$ and with $p_T > 30$ GeV are considered in the analysis.

Exactly one jet is required to be present in the event, and it must be identified as coming
from a b quark. The identification of b jets is done according to an algorithm that
reconstructs the secondary vertex of the decay of the b quark [23, 24], resulting in a
discriminating variable sensitive to the lifetime of b hadrons. The selection on this
discriminant yields a b-tagging efficiency of 62% with a mistag rate of 1.4% for jets with
$p_T$ between 50 and 80 GeV. Events with additional b-tagged jets with $p_T > 20$ GeV are
removed. After this selection, the sample is dominated by $t\bar{t}$ events and tW signal.

To reduce the dependence on the modeling of the $t\bar{t}$ background, events with exactly
two jets, in which either one or both jets have been b tagged, are used in the statistical
fit to constrain the contribution of this background in the signal region. Three regions are
defined per dilepton final state: one region with one jet that is b tagged (1j1t), where the
tW signal is substantial, and two regions with two jets, where the $t\bar{t}$ background is
dominant and exactly one or two b tags are required (2j1t and 2j2t, respectively).
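
The assignment of an event to the 1j1t, 2j1t, or 2j2t region can be sketched as follows. The jet representation (dictionaries with `pt`, `eta`, and `btag`) is hypothetical, and two details are assumptions made for illustration: the veto on additional b-tagged jets with $p_T > 20$ GeV is applied only in the one-jet region, and no pseudorapidity requirement is placed on those softer jets.

```python
def classify_region(jets):
    """Return '1j1t', '2j1t', '2j2t', or None for a list of hypothetical jet dicts.

    Each jet is a dict with 'pt' [GeV], 'eta', and 'btag' (bool).
    """
    # Analysis jets: pT > 30 GeV and |eta| < 2.4, as stated in the text.
    selected = [j for j in jets if j["pt"] > 30.0 and abs(j["eta"]) < 2.4]
    n_jets = len(selected)
    n_tags = sum(1 for j in selected if j["btag"])

    # Assumption: softer b-tagged jets (20 < pT <= 30 GeV) veto the signal region only.
    soft_btags = [j for j in jets if 20.0 < j["pt"] <= 30.0 and j["btag"]]

    if n_jets == 1 and n_tags == 1 and not soft_btags:
        return "1j1t"       # signal region
    if n_jets == 2 and n_tags == 1:
        return "2j1t"       # ttbar-enriched control region
    if n_jets == 2 and n_tags == 2:
        return "2j2t"       # ttbar-enriched control region
    return None             # event not used in the fit


b_jet = {"pt": 45.0, "eta": 0.3, "btag": True}
light_jet = {"pt": 60.0, "eta": -1.1, "btag": False}
soft_b_jet = {"pt": 25.0, "eta": 0.8, "btag": True}

print(classify_region([b_jet]))                 # one tagged jet
print(classify_region([b_jet, light_jet]))      # two jets, one tag
print(classify_region([b_jet, dict(b_jet)]))    # two jets, two tags
```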

A smaller background comes from Z/$\gamma^*$ events. It is found that in high-pileup
scenarios the $E_T^{\text{miss}}$ distribution for Z/$\gamma^*$ events is not properly
modeled by the simulation, leading to disagreement between data and simulation. To solve
this problem, the Z/$\gamma^*$ simulation is corrected to match the missing transverse
energy distribution observed in the data using events from the Z resonance.

The contributions of other backgrounds, i.e., diboson production (WW, WZ, ZZ), QCD multijet
production, W+jets, and other single-top-quark processes, are small, amounting to less than
1% of the selected events, and are estimated from simulation.

Table 1: Event yields in the signal (1j1t) and control regions for data and simulation. The
simulation is quoted with statistical and systematic uncertainties.

Process              1j1t               2j1t               2j2t
-------------------  -----------------  -----------------  -----------------
tW                   336 ± 5 ± 16       180 ± 3 ± 16       45 ± 1 ± 6
$t\bar{t}$           1263 ± 19 ± 138    2775 ± 28 ± 205    1488 ± 21 ± 222
Z/$\gamma^*$+jets    128 ± 12 ± 28      113 ± 10 ± 22      8.5 ± 1.8 ± 1.8
Other                19 ± 3             8.8 ± 0.7 ± 0.2    4 ± 3
Total estimated      1746 ± 23 ± 141    3077 ± 30 ± 207    1546 ± 21 ± 222
Total data           1699               2878               1507

The number of events in the signal and two control regions is presented for data and
simulation in Table 1. The approximate composition of the sample at this level is 70%
$t\bar{t}$ events with 20% tW events in the signal region. In the 2j1t region the
$t\bar{t}$ content represents 90% of the events, while tW events are less than 6%. In the
2j2t region, more than 95% of the events are $t\bar{t}$ events.
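
The quoted sample composition follows directly from the central values in Table 1. The short sketch below recomputes the fractions; it uses only the simulated central values and ignores their uncertainties, and the dictionary layout is simply an illustrative choice.

```python
# Central values of the simulated yields from Table 1 (uncertainties omitted).
yields = {
    "1j1t": {"tW": 336.0, "ttbar": 1263.0, "Z/gamma*+jets": 128.0, "Other": 19.0},
    "2j1t": {"tW": 180.0, "ttbar": 2775.0, "Z/gamma*+jets": 113.0, "Other": 8.8},
    "2j2t": {"tW": 45.0, "ttbar": 1488.0, "Z/gamma*+jets": 8.5, "Other": 4.0},
}
data = {"1j1t": 1699, "2j1t": 2878, "2j2t": 1507}


def composition(region):
    """Fraction of each simulated process in the given region."""
    total = sum(yields[region].values())
    return {proc: n / total for proc, n in yields[region].items()}


for region in yields:
    total = sum(yields[region].values())
    fracs = composition(region)
    summary = ", ".join(f"{proc} {100 * f:.1f}%" for proc, f in fracs.items())
    print(f"{region}: total = {total:.1f}, data/simulation = {data[region] / total:.3f}")
    print(f"    {summary}")
```

The fractions obtained this way (roughly 72% $t\bar{t}$ and 19% tW in 1j1t) are consistent with the rounded percentages quoted in the text.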

A multivariate analysis based on boosted decision trees ("BDT" analysis) [25, 26] is used to
test the overall compatibility of the signal event candidates with the event topology of tW
associated production. Four variables are chosen to train the BDT based on their ability to
separate the tW signal from the dominant $t\bar{t}$ background. These variables are $H_T$,
defined as the scalar sum of the transverse momenta of the leptons, the jet, and the
$E_T^{\text{miss}}$; the $p_T$ of the system composed of the leptons, the
$E_T^{\text{miss}}$, and the jet; the $p_T$ of the leading jet; and the difference in
azimuthal angle, $\Delta\phi$, between the direction of the $E_T^{\text{miss}}$ and the
closest of the two selected leptons. The distributions of $H_T$ and of the $p_T$ of the
system composed of the leptons, the $E_T^{\text{miss}}$, and the jet are presented for the
signal region (1j1t) in Fig. 3. The presence of the tW signal over the background is visible
in all the distributions.

Figure 3: Distributions of $H_T$ and the $p_T$ of the system composed of the leptons,
$E_T^{\text{miss}}$, and jet, in data and simulation after jet selection in the signal
region (1j1t).
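
The four BDT input variables can be computed from transverse quantities alone. The sketch below assumes a hypothetical representation in which each object is a `(pt, phi)` pair and the selected jet is the leading jet; it illustrates the definitions given in the text and is not the analysis code.

```python
import math


def delta_phi(phi_a, phi_b):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    d = (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi
    return abs(d)


def bdt_inputs(leptons, jet, met):
    """Compute the four discriminating variables from (pt, phi) pairs.

    leptons: list of two (pt, phi) tuples; jet and met: single (pt, phi) tuples.
    """
    objects = list(leptons) + [jet, met]
    h_t = sum(pt for pt, _ in objects)                    # scalar sum of transverse momenta
    px = sum(pt * math.cos(phi) for pt, phi in objects)   # vector sum, x component
    py = sum(pt * math.sin(phi) for pt, phi in objects)   # vector sum, y component
    pt_system = math.hypot(px, py)
    dphi_met_lepton = min(delta_phi(met[1], phi) for _, phi in leptons)
    return {
        "HT": h_t,
        "pt_system": pt_system,
        "pt_leading_jet": jet[0],
        "dphi_met_closest_lepton": dphi_met_lepton,
    }


# Toy event: two back-to-back leptons, a jet, and missing transverse energy.
variables = bdt_inputs(
    leptons=[(30.0, 0.0), (40.0, math.pi)],
    jet=(50.0, math.pi / 2.0),
    met=(20.0, -math.pi / 2.0),
)
for name, value in variables.items():
    print(f"{name} = {value:.3f}")
```

For tW events the lepton, jet, and $E_T^{\text{miss}}$ system tends to balance in the transverse plane, which is why the $p_T$ of the system is a useful handle against $t\bar{t}$ events with an unreconstructed second b jet.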

Figure 4: Distribution of the BDT discriminant in the signal region (1j1t) in data and
simulation. The horizontal axis (BDT discriminant) is divided into the bins -1.0 to -0.99,
-0.99 to -0.7, -0.7 to 0.7, 0.7 to 0.99, and 0.99 to 1.0.

The output of the BDT is a single discriminant value for every event, ranging from -1
(background-like) to +1 (signal-like). The distribution of the BDT discriminant is shown for
the 1j1t signal region in Fig. 4. Although the tW signal does not peak strongly at +1, its
distribution is distinguishable from those of $t\bar{t}$ and other backgrounds. Maximum
signal sensitivity is achieved through a simultaneous fit to the three BDT discriminant
shapes (1j1t, 2j1t, and 2j2t); the two $t\bar{t}$-enriched regions are included to control
the rate of this background in the signal region.

The impact of each individual source of uncertainty on the analysis has been estimated in
every region and final state. The dominant systematic uncertainty that affects the rate of
the tW signal is associated with the b-tagging efficiency, with values between 3% and 6% for
the different final states. The b-tagging efficiency uncertainty is also important for the
$t\bar{t}$ background yield, with values between 1.5% and 4.0%. The main systematic
uncertainty for the $t\bar{t}$ background is due to the factorization/renormalization scale
used in the simulation, which amounts to up to 11%, compared with values around 2% for the
tW signal. Also for $t\bar{t}$, the uncertainties due to the jet energy scale (7%) and the
threshold used to match the matrix element generator to the parton shower model in simulation
(3%) are important. The statistical uncertainty is the largest contribution to the
uncertainty of the measured cross section, with a 20% effect.

A binned likelihood fit is performed on the distributions of the BDT discriminant. Template
shapes for the signal and backgrounds are taken from simulation. Distributions are included
separately in the fit for each of the three dilepton channels (ee, e$\mu$, and $\mu\mu$) in
the signal region (1j1t) and control regions (2j1t and 2j2t). Signal and background rates are
allowed to vary in the fit, and a signal rate with a 68% confidence level (CL) interval is
determined using the profile likelihood method. The theory sources of systematic uncertainty
that affect the template shape are then considered. For each uncertainty, $\pm 1\sigma$
systematic shifts are applied to the simulated samples to obtain revised templates.
Differences in signal rate found using the revised templates are taken as systematic
uncertainties and are added in quadrature to the $1\sigma$ interval from the fit using the
baseline templates. The significance is calculated using a profile likelihood ratio as the
test statistic. The expected significance is evaluated using the median and $\pm 1\sigma$
interval of significance values obtained from pseudo-experiments generated using the
theoretical prediction of the standard model tW cross section.
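
The structure of such a fit can be illustrated with a stripped-down binned Poisson likelihood in a single signal-strength parameter. The sketch below is deliberately simplified: the templates are invented toy numbers, there are no nuisance parameters (so nothing is actually profiled), and the significance uses the asymptotic relation $Z = \sqrt{q_0}$ rather than pseudo-experiments. It should be read as a picture of the likelihood-ratio idea, not as the fit used in the analysis.

```python
import math


def nll(mu, data, sig, bkg):
    """Poisson negative log-likelihood (constant terms dropped) for signal strength mu."""
    total = 0.0
    for n, s, b in zip(data, sig, bkg):
        lam = mu * s + b                 # expected events in this bin
        total += lam - n * math.log(lam)
    return total


def fit_mu(data, sig, bkg, lo=0.0, hi=5.0, tol=1e-6):
    """Minimise the NLL in mu with a ternary search (the NLL is convex in mu)."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if nll(m1, data, sig, bkg) < nll(m2, data, sig, bkg):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def significance(data, sig, bkg):
    """Best-fit mu and asymptotic significance from the likelihood ratio against mu = 0."""
    mu_hat = fit_mu(data, sig, bkg)
    q0 = 2.0 * (nll(0.0, data, sig, bkg) - nll(mu_hat, data, sig, bkg))
    return mu_hat, math.sqrt(max(q0, 0.0))


# Toy templates (hypothetical numbers, ordered from background-like to signal-like bins).
sig_template = [2.0, 5.0, 10.0, 8.0]
bkg_template = [100.0, 80.0, 40.0, 10.0]
# "Asimov" data set: exactly the signal-plus-background expectation.
asimov = [s + b for s, b in zip(sig_template, bkg_template)]

mu_hat, z = significance(asimov, sig_template, bkg_template)
print(f"best-fit signal strength = {mu_hat:.3f}, significance = {z:.2f} sigma")
```

In a fuller treatment each systematic source would enter as a nuisance parameter that is minimised over for every value of the signal strength, which generally lowers the significance relative to this statistics-only toy.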

An excess of events over the expected background is observed with a significance of
4.0$\sigma$, compatible with the expected significance of the tW signal,
$3.6^{+0.8}_{-0.9}\sigma$. The measured cross section, including both statistical and
systematic uncertainties, is $16^{+5}_{-4}$ pb, in agreement with the standard model
prediction.

The measurement can be used to determine the absolute value of the Cabibbo-Kobayashi-Maskawa
matrix element $|V_{tb}|$, following the same technique as in previous single-top-quark
measurements, assuming that $|V_{td}|$ and $|V_{ts}|$ are much smaller than $|V_{tb}|$:

$$|V_{tb}| = \sqrt{\frac{\sigma_{tW}}{\sigma_{tW}^{\text{th}}}}, \qquad (1)$$

where $\sigma_{tW}^{\text{th}}$ is the standard model prediction computed assuming
$|V_{tb}| = 1$. Using the standard model assumption of $0 \le |V_{tb}|^2 \le 1$, a value of
$|V_{tb}| = 1.00$ is inferred, with a 90% confidence level interval of [0.79, 1.00]. This is
based on profile likelihood intervals, the same method used for the cross section
measurement and its intervals.
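
As a quick numerical check of the relation between the cross section and $|V_{tb}|$, the ratio of the measured and predicted central values can be evaluated directly. The sketch uses only the central values quoted in the text (16 pb measured, 15.6 pb predicted); the clipping to the physical region is a crude stand-in for the constraint $0 \le |V_{tb}|^2 \le 1$, whereas the quoted interval comes from a profile likelihood that is not reproduced here.

```python
import math

SIGMA_MEASURED_PB = 16.0   # measured tW cross section (central value from the text)
SIGMA_THEORY_PB = 15.6     # standard model prediction for |V_tb| = 1 (central value)


def vtb_from_cross_section(sigma_meas, sigma_th):
    """|V_tb| from the ratio of measured to predicted cross sections.

    Assumes |V_td| and |V_ts| are negligible, so sigma_tW scales as |V_tb|^2.
    """
    return math.sqrt(sigma_meas / sigma_th)


vtb_unconstrained = vtb_from_cross_section(SIGMA_MEASURED_PB, SIGMA_THEORY_PB)
# Illustrative clipping to the physical region 0 <= |V_tb| <= 1.
vtb_constrained = min(max(vtb_unconstrained, 0.0), 1.0)

print(f"unconstrained |V_tb| = {vtb_unconstrained:.3f}")
print(f"constrained   |V_tb| = {vtb_constrained:.2f}")
```

Because the measured central value lies slightly above the prediction, the unconstrained estimate is marginally larger than one, which is why the constrained value sits at the boundary.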

A second analysis ("count-based" analysis), used as a cross-check, is performed using event
counts. After the jet selection step, instead of building the BDT discriminant, events are
required in addition to have $H_T > 60$ GeV in the e$\mu$ channel, where no invariant mass
and $E_T^{\text{miss}}$ requirements are applied.

The count-based analysis uses a statistical model of Poisson event counts in the three
dilepton final states in the signal region (1j1t) and control regions (2j1t and 2j2t). The
number of events selected in data in the signal region is 1606, compared with
1671 ± 22 (stat.) ± 134 (syst.) predicted by the simulation. For the 2j1t control region,
2766 events are selected in data and 2989 ± 30 (stat.) ± 200 (syst.) events are predicted,
while for the 2j2t region, 1448 data events are selected with
1485 ± 21 (stat.) ± 211 (syst.) predicted.
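
The level of agreement between the observed and predicted counts can be gauged with a simple pull per region. This is a back-of-the-envelope sketch, not the statistical model of the analysis: it treats the statistical and systematic uncertainties of the prediction as uncorrelated, adds the Poisson variance of the observed count, and uses a Gaussian approximation.

```python
import math

# Region: (observed events, predicted events, stat. uncertainty, syst. uncertainty)
counts = {
    "1j1t": (1606, 1671.0, 22.0, 134.0),
    "2j1t": (2766, 2989.0, 30.0, 200.0),
    "2j2t": (1448, 1485.0, 21.0, 211.0),
}


def pull(observed, predicted, stat, syst):
    """(observed - predicted) divided by the combined uncertainty.

    Assumption: uncorrelated uncertainties added in quadrature, with the
    Poisson variance of the observed count included.
    """
    sigma = math.sqrt(stat ** 2 + syst ** 2 + observed)
    return (observed - predicted) / sigma


pulls = {region: pull(*values) for region, values in counts.items()}
for region, value in pulls.items():
    print(f"{region}: pull = {value:+.2f}")
```

All three regions agree with the prediction to within about one combined standard deviation in this approximation.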

The event yield for each process in every region is affected by different sources of
systematic uncertainties, equivalent to the ones calculated for the BDT analysis. These are
included in the model as nuisance parameters. The same methods for the cross section
measurement and the significance calculation as in the BDT analysis have been used. Figure 5
shows the event yields selected by the count-based analysis for each region, in data and
simulation, in which the simulation yields have been normalized to the outcome of the maximum
likelihood fit. The observed significance of the tW signal obtained with the count-based
analysis is 3.5$\sigma$, with an expected significance of $3.2 \pm 0.9\sigma$. The
count-based analysis measures a cross section of 15 ± 5 pb. These results are consistent with
those obtained with the BDT analysis.




Figure 5: Event yields in data and simulation in the signal region (1j1t) and the two
$t\bar{t}$-enriched control regions for the count-based analysis. Simulation yields are
scaled to the outcome of the statistical fit.

In summary, using 4.9 fb$^{-1}$ of data collected with the CMS experiment at the LHC,
evidence has been found for the associated production of a single top quark and a W boson in
pp collisions at $\sqrt{s} = 7$ TeV, with a significance of 4.0$\sigma$ and a measured cross
section of $16^{+5}_{-4}$ pb.

A Supplemental Information

This document presents additional material to the publication. Figure 6 shows the
distributions of the two variables of the BDT not presented before, and Tables 2 and 3
contain information related to the systematic uncertainties that affect the analysis.

The distributions of the $p_T$ of the leading jet and the difference in azimuthal angle,
$\Delta\phi$, between the direction of the $E_T^{\text{miss}}$ and the closest of the two
selected leptons, in data and simulation, are presented for the signal region (1j1t) in
Fig. 6.




Figure 6: Distributions of the $p_T$ of the leading jet (horizontal axis in GeV) and the
difference in $\phi$ between the direction of the $E_T^{\text{miss}}$ and the closest lepton,
$\Delta\phi(E_T^{\text{miss}}, \text{closest lepton})$ (horizontal axis in rad), in data and
simulation after jet selection in the signal region (1j1t).



Table 2 presents the impact of each individual source of systematic uncertainty on the rate
of the different processes in the signal region for each final state. When two numbers are
listed for a single uncertainty, the upper number is the effect on the rate when the
systematic uncertainty source is scaled up and the lower is the effect when it is scaled
down. Entries marked with a "-" either do not apply for that particular final state or
process, or have a negligible effect. "Other processes" refers to Z/$\gamma^*$ and the
remaining backgrounds, which have almost negligible contributions.

Table 3 presents the contribution to the uncertainty of the measured cross section of the
different sources of uncertainty considered in the analysis. This is estimated by fixing each
source one at a time and measuring the effect on the cross section uncertainty.




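Table 2: Impact of each individual source of systematic uncertainty on the rate of the
different processes in the signal region for each final state. Values are given in percent
as ee / e$\mu$ / $\mu\mu$. For sources with two rows, "up" and "down" give the effect of
scaling the source up and down. "Other processes" refers to Z/$\gamma^*$ and the remaining
backgrounds, which have almost negligible contributions.

Systematic uncertainty                          tW (%)                  $t\bar{t}$ (%)           Other processes (%)
----------------------------------------------  ----------------------  -----------------------  -----------------------
Luminosity                                      2.2 / 2.2 / 2.2         2.2 / 2.2 / 2.2          2.2 / 2.2 / 2.2
Pileup modeling                                 2.1 / 0.6 / 0.2         1.0 / 0.5 / 0.3          4.9 / 1.6 / 3.7
Electron trigger efficiency                     1.5 / 1.1 / -           1.5 / 1.1 / -            1.5 / 1.1 / -
Muon trigger efficiency                         - / 1.1 / 1.5           - / 1.1 / 1.5            - / 1.1 / 1.5
Electron identification                         2 / 2 / -               2 / 2 / -                2 / 2 / -
Muon identification                             - / 1 / 1               - / 1 / 1                - / 1 / 1
b-tagging (up)                                  +2.6 / +4.1 / +4.1      +3.2 / +2.5 / +2.6       +3.9 / +3.4 / +3.2
b-tagging (down)                                -4.0 / -3.7 / -3.3      -1.1 / -3.3 / -3.4       -7.0 / -3.6 / -3.5
Jet energy scale (up)                           -1.1 / -0.6 / -1.8      -4.3 / -4.7 / -3.6       +13.2 / -1.2 / +11.2
Jet energy scale (down)                         +0.4 / +1.7 / +2.1      +7.6 / +3.7 / +5.5       +1.7 / -0.1 / +10.7
Jet energy resolution (up)                      +1.2 / +0.1 / +0.2      +2.9 / -0.5 / +2.9       +8.0 / -4.4 / +1.4
Jet energy resolution (down)                    +1.4 / +0.2 / +0.8      +0.5 / -2.2 / +0.1       -5.3 / +7.7 / -11.9
$E_T^{\text{miss}}$ modeling (up)               -0.6 / - / +1.0         +0.2 / - / +0.3          +2.4 / -0.1 / +13.5
$E_T^{\text{miss}}$ modeling (down)             +1.0 / - / -0.2         -0.9 / - / +0.5          -7.7 / +0.2 / -2.5
Factorization/renormalization scale $Q^2$ (up)  +3.1 / +3.4 / +3.3      +10.0 / +10.1 / +10.1    - / - / -
Factorization/renormalization scale $Q^2$ (down) +0.4 / +0.3 / +0.4     -6.5 / -6.9 / -6.9       - / - / -
ME/PS matching thresholds (up)                  - / - / -               +0.7 / +0.5 / +0.4       - / - / -
ME/PS matching thresholds (down)                - / - / -               +2.0 / +1.6 / +1.8       - / - / -
tW DR/DS scheme (up)                            -5.4 / +6.4 / +1.7      - / - / -                - / - / -
tW DR/DS scheme (down)                          -0.5 / -0.5 / -0.5      - / - / -                - / - / -
PDF uncertainties                               2.2 / 2.0 / 2.0         - / - / -                - / - / -
$t\bar{t}$ cross section (up)                   - / - / -               +6.2 / +6.2 / +6.2       - / - / -
$t\bar{t}$ cross section (down)                 - / - / -               -5.8 / -5.8 / -5.8       - / - / -
Z/$\gamma^*$ modeling                           - / - / -               - / - / -                30.5 / 1.2 / 23.5
Simulation statistics                           3.8 / 1.8 / 2.7         4.5 / 2.0 / 2.9          18.0 / 12.0 / 12.4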

Table 3: Contribution to the uncertainty of the measured cross section of the different sources 
of uncertainty considered in the BDT analysis. 


Systematic uncertainty           $\Delta\sigma$ (pb)    $\Delta\sigma/\sigma$
-------------------------------  ---------------------  ---------------------
Luminosity                       0.69                   0.04
Pileup modeling                  0.24                   0.02
Electron trigger efficiency      0.35                   0.02
Muon trigger efficiency          0.38                   0.02
Electron identification          0.70                   0.04
Muon identification              0.45                   0.03
b-tagging                        0.30                   0.02
Jet energy scale                 2.42                   0.15
Jet energy resolution            0.58                   0.04
$E_T^{\text{miss}}$ modeling     0.40                   0.05
tW $Q^2$                         0.34                   0.02
$t\bar{t}$ $Q^2$                 0.29                   0.02
ME/PS matching thresholds        1.62                   0.10
tW DR/DS scheme                  0.94                   0.06
PDF uncertainties                0.34                   0.02
$t\bar{t}$ cross section         0.96                   0.06
Z/$\gamma^*$ modeling            0.67                   0.04
Statistical                      3.33                   0.21
Total                            4.95                   0.31
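
As a consistency check on Table 3, the individual contributions can be combined in quadrature and compared with the quoted total. This sketch assumes the sources are uncorrelated, which need not hold exactly in the fit, so only approximate agreement with the quoted total of 4.95 pb should be expected.

```python
import math

# Contributions to the cross section uncertainty from Table 3, in pb.
contributions_pb = {
    "Luminosity": 0.69,
    "Pileup modeling": 0.24,
    "Electron trigger efficiency": 0.35,
    "Muon trigger efficiency": 0.38,
    "Electron identification": 0.70,
    "Muon identification": 0.45,
    "b-tagging": 0.30,
    "Jet energy scale": 2.42,
    "Jet energy resolution": 0.58,
    "ETmiss modeling": 0.40,
    "tW Q2": 0.34,
    "ttbar Q2": 0.29,
    "ME/PS matching thresholds": 1.62,
    "tW DR/DS scheme": 0.94,
    "PDF uncertainties": 0.34,
    "ttbar cross section": 0.96,
    "Z/gamma* modeling": 0.67,
    "Statistical": 3.33,
}
QUOTED_TOTAL_PB = 4.95


def quadrature_sum(values):
    """Combine uncorrelated uncertainties in quadrature."""
    return math.sqrt(sum(v * v for v in values))


total_pb = quadrature_sum(contributions_pb.values())
systematic_pb = quadrature_sum(
    v for name, v in contributions_pb.items() if name != "Statistical"
)

print(f"quadrature sum of all sources = {total_pb:.2f} pb (quoted: {QUOTED_TOTAL_PB} pb)")
print(f"systematic sources only       = {systematic_pb:.2f} pb")
print(f"statistical                   = {contributions_pb['Statistical']:.2f} pb")
```

The simple quadrature sum lands close to the quoted total, with the statistical term and the jet energy scale as the two largest entries.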